Data Analytics Unit
Quarterly Orientation
June, 2023
“If we have data, let’s look at the data. If all we have are opinions, let’s go with mine.”
Jim Barksdale
The DAU
- Small but growing team
- More of a horizontal coordination role
- Data focused
What do we do?
- Data collection through non-traditional methods
- Data wrangling and visualization
- Workflow developments
- Research
We are not involved in the calculation of index scores
Data collection
- Web scraping is the process of collecting information from the web and exporting it as an organized data structure that fits our needs.
- Automated process run by scraper bots.
- Data collected:
- EU Lawyers Data: Lawyer information from national and regional Bar Associations across 27 EU countries.
- Political News Data: Headlines, descriptions, and body text of around 100,000 news articles from the political sections of 12 major newspapers in the EU.
- Geocoding is the process of transforming a description of a location into geographical coordinates such as longitude and latitude.
- Python/R clients for the APIs of online mapping services such as Google Maps or OpenStreetMap.
- Implemented in the EU Lawyers Data.
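To make the two collection steps above concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the DAU's actual scraper: the HTML layout, the class names (`lawyer`, `name`, `city`), the sample records, and the canned geocoding response are all made up. The only real-world detail is the public OpenStreetMap Nominatim search endpoint, which returns a JSON array whose items carry `lat` and `lon` as strings. No network request is made here.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode


class LawyerListParser(HTMLParser):
    """Collect name/city pairs from a hypothetical <li class="lawyer"> listing."""

    def __init__(self):
        super().__init__()
        self.records = []   # one dict per lawyer entry
        self._field = None  # field of the <span> currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "lawyer":
            self.records.append({})
        elif tag == "span" and self.records and attrs.get("class") in ("name", "city"):
            self._field = attrs["class"]

    def handle_data(self, data):
        if self._field:
            self.records[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_geocode_url(address):
    """Build a Nominatim search URL for a free-text location description."""
    return NOMINATIM_SEARCH + "?" + urlencode({"q": address, "format": "json", "limit": 1})


def parse_geocode_response(body):
    """Return (latitude, longitude) from a Nominatim JSON response, or None if empty."""
    results = json.loads(body)
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])


# Made-up page fragment standing in for a bar association directory.
SAMPLE_HTML = """
<ul>
  <li class="lawyer"><span class="name">Ana Example</span><span class="city">Madrid</span></li>
  <li class="lawyer"><span class="name">Jan Sample</span><span class="city">Warsaw</span></li>
</ul>
"""

# Canned response in the shape Nominatim returns; values are illustrative.
SAMPLE_RESPONSE = '[{"lat": "40.4167", "lon": "-3.7036", "display_name": "Madrid"}]'

parser = LawyerListParser()
parser.feed(SAMPLE_HTML)
records = parser.records

print(records)
print(build_geocode_url(records[0]["city"] + ", Spain"))
print(parse_geocode_response(SAMPLE_RESPONSE))
```

In a real pipeline the URL would be fetched with an HTTP client that sends an identifying User-Agent and respects the service's rate limits, and the returned coordinates would be attached to each scraped record.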
Data wrangling and visualization
- Data wrangling is the process of transforming unstructured or “dirty” data into a tidy and structured version ready for analysis.
- The process is outcome-specific.
- Multidisciplinary process
- Visualization should be data driven and outcome oriented.
- Outputs produced:
- Charts
- Infographics
- Interactive dashboards
- Dynamic visualizations
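As a small illustration of the data wrangling step described above, the sketch below turns a "dirty" CSV extract into tidy records using only the Python standard library. The column names, sample rows, and cleaning rules are hypothetical and chosen only to show the idea; a real project would pick rules that match the intended output.

```python
import csv
import io

# Made-up raw extract: inconsistent spacing, casing, and missing values.
RAW = """Name , Country,Bar Admission Year
 ana example ,ES,2011
JAN SAMPLE,pl ,n/a
,DE,2005
"""


def tidy(raw_text):
    """Return a list of cleaned row dicts with normalized column names."""
    reader = csv.reader(io.StringIO(raw_text))
    # Normalize headers to snake_case, e.g. "Bar Admission Year" -> "bar_admission_year".
    header = [h.strip().lower().replace(" ", "_") for h in next(reader)]
    rows = []
    for values in reader:
        if not values:
            continue  # skip blank lines
        row = dict(zip(header, (v.strip() for v in values)))
        if not row["name"]:
            continue  # drop records without a name
        row["name"] = row["name"].title()
        row["country"] = row["country"].upper()
        year = row["bar_admission_year"]
        # Keep the year only when it is a plain number; otherwise mark it missing.
        row["bar_admission_year"] = int(year) if year.isdigit() else None
        rows.append(row)
    return rows


clean_rows = tidy(RAW)
for row in clean_rows:
    print(row)
```

Wrapping the cleaning rules in a single function means the same steps can be rerun unchanged whenever a new extract arrives.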
Workflow developments
- Adequate tools can have a great impact on the efficiency of the team.
- We develop our own tools for:
“Anything that can be automated, should be automated. Do as little as possible by hand. Do as much as possible with functions.”
Hadley Wickham
How do we do it?
- Data cleaning
- Data exploration
- Intensive data cleaning
- Data visualization
- Workflow developments
- Geospatial manipulation
- Statistical models
- Web scraping
- Machine learning models
- Natural language processing
- Tool development
- HTML & CSS
- Online reports
- Aesthetic manipulation
- Markdown & Quarto
- Documentation
- Presentations
- Internal reports
Thank you for your attention